- In article <23177A@erik.naggum.no> you write:
- >Dan Connolly <connolly@convex.com> writes:
- >|
- >| The WWW group is attempting to define a multimedia interchange
- >| format called HTML. . . .
- >
- >Why not use HyTime?
- >
- Erik:
- Partly because of ignorance (we've heard of HyTime, but we don't
- know the details). I'd expect a HyTime engine to be quite a bit
- of work to implement. And partly because, as I understand it, HyTime
- doesn't go as far as to prescribe a DTD. The WWW project needs
- one particular language, not a whole architecture.
-
- I'd certainly like to know more about HyTime's techniques for addressing
- documents, esp. elements of documents.
-
- Now for the WWW gang:
- >:
- >| That is, is it possible to put an arbitrary 8 bit binary stream
- >| _inside_ an SGML document? My guess is: no. But if we use
- >| CDATA, can we include anything that doesn't contain the closing
- >| tag in full?
- >
- >If you by "the closing tag in full" mean the entire end-tag, complete
- >with etago, generic identifier, and tagc, as in "</image>", this is not
- >the way SGML does it. CDATA and SDATA are terminated by an etago
- >"delimiter-in-context", which is an etago (end-tag open, "</") delimiter
- >followed by a name start character, or a grpo (group open, "(")
- >delimiter if concurrent document types are allowed. In the reference
- >concrete syntax, this means that the regular expression "</[(a-z]"
- >matches the end of CDATA and SDATA elements.
- >
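The termination rule quoted above can be sketched in a few lines. This is only an illustration, not a parser: the function name `cdata_content_end` is hypothetical, and the pattern assumes the reference concrete syntax, with name start characters taken to be the ASCII letters in both cases (the quoted expression lists only a-z).

```python
import re

# Assumption: reference concrete syntax, where etago is "</", grpo is "(",
# and name start characters are the ASCII letters in either case.
ETAGO_IN_CONTEXT = re.compile(r"</[(A-Za-z]")

def cdata_content_end(text, start=0):
    """Return the offset at which CDATA content beginning at `start` ends,
    or -1 if no etago delimiter-in-context follows."""
    match = ETAGO_IN_CONTEXT.search(text, start)
    return match.start() if match else -1

# "</ b" does not end the data (a space is not a name start character),
# but "</p>" does, even though the element being closed is "image".
sample = "if (a </ b) x = 1; </p> tail </image>"
end = cdata_content_end(sample)
```

Here `sample[:end]` is the data the parser would accept; everything from `</p>` onward is treated as markup, which is why arbitrary bytes cannot be placed there safely.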
- >You can also use marked sections, with a CDATA status keyword, in which
- >case the CDATA is terminated by the mse delimiter (marked section end,
- >"]]>").
- >
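For comparison, a rough sketch of the marked-section case, where only the mse delimiter ends the data. The helper name `cdata_marked_sections` is hypothetical, the delimiters are those of the reference concrete syntax, and nested or otherwise keyworded marked sections are deliberately ignored.

```python
import re

# Assumption: reference concrete syntax ("<![" ... "[" to open, "]]>" to close).
CDATA_SECTION_START = re.compile(r"<!\[\s*CDATA\s*\[")
MSE = "]]>"

def cdata_marked_sections(text):
    """Return the contents of each CDATA marked section found in `text`."""
    contents = []
    pos = 0
    while True:
        start = CDATA_SECTION_START.search(text, pos)
        if start is None:
            break
        end = text.find(MSE, start.end())
        if end == -1:
            break  # unterminated section: stop rather than guess
        contents.append(text[start.end():end])
        pos = end + len(MSE)
    return contents

sections = cdata_marked_sections("a <![ CDATA [x </p> y]]> b")
```

Note that the `</p>` inside the section is kept as data, unlike in CDATA element content.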
- >:
- >| Someone made the point that an SGML document is only allowed to
- >| include SGML characters as specified by the SGML declaration, and if
- >| we're going to use the default SGML declaration, we have to stick to
- >| the characters blessed by it.
- >
- >Blessed and blessed. The SGML declaration is supposed to reflect the
- >reality of the document, not enforce arbitrary limits on it. So you
- >write an SGML declaration which fits the document.
- >
- >| That's not my understanding. I thought that inside CDATA (or SDATA,
- >| I think) you could put _anything_ but the closing tag in full.
- >
- >As said above, the etago delimiter-in-context terminates the data,
- >regardless of whether it's a legal end-tag in that context.
- >
- >You should be aware that the SGML parser will parse the contents of the
- >"binary" content, and ignore record start, and treat record ends
- >differently from other characters. In addition, it's an error for an SGML
- >entity to contain characters with any of the numbers listed in the
- >SHUNCHAR part of the SYNTAX declaration. This is _not_ what you want
- >with binary data.
- >
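To see why raw binary data runs into the SHUNCHAR restriction, one can scan a byte string for shunned character numbers. The set below is an assumption made for illustration (control characters 0 to 31 other than tab, line feed and carriage return, plus 127 and 255); the real list comes from the SGML declaration in use, and `shunned_offsets` is a hypothetical helper.

```python
# Assumed shunned character numbers; the authoritative list is whatever
# the SHUNCHAR part of the SGML declaration actually says.
ASSUMED_SHUNNED = (frozenset(range(32)) - {9, 10, 13}) | {127, 255}

def shunned_offsets(data: bytes, shunned=ASSUMED_SHUNNED):
    """Return the offsets of bytes whose character number is shunned."""
    return [i for i, byte in enumerate(data) if byte in shunned]

# Typical binary headers contain NUL and other control bytes.
offsets = shunned_offsets(b"GIF\x00\x01ok\n")
```

Any non-empty result means the data could not appear directly in an SGML entity under that declaration.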
- >| What's the scoop? Do we have to use external entities for raw data?
- >
- >Yes. An external entity that is not an SGML text entity requires a
- >notation identifier, so you only need to list the entities in the DTD,
- >with notation, and refer to them by name in the document instance.
- >
- >If this is not satisfactory, you should declare the objects to be CDATA,
- >and use a binary to text-only transformation scheme. There are several
- >such schemes. Among them, base64 is the preferred encoding in my view,
- >since it's available as part of the new Multipurpose Internet Mail
- >Extensions (MIME) RFC-to-be. (The latest draft is available for
- >anonymous FTP as ftp.ifi.uio.no:/pub/SGML/MIME.6.ps and MIME.6.txt for
- >two weeks from today. Section 5.2 which concerns the base64 encoding is
- >also available as ftp.ifi.uio.no:/pub/SGML/base64.txt.) Transformation
- >back to the binary form from the text-only form may be done on the fly
- >by the application before sending the data to the notation interpreter.
- >
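A minimal sketch of the binary to text-only transformation described above, using Python's standard `base64` module. The function names are hypothetical. The point of interest is that the base64 alphabet contains no "<", so the encoded text can never contain an etago delimiter.

```python
import base64

def to_cdata_text(raw: bytes) -> str:
    """Encode binary data as base64 text, wrapped into short lines."""
    return base64.encodebytes(raw).decode("ascii")

def from_cdata_text(text: str) -> bytes:
    """Decode base64 text back to binary; line breaks are ignored."""
    return base64.b64decode(text)

raw = bytes(range(256))          # every possible byte value
encoded = to_cdata_text(raw)
decoded = from_cdata_text(encoded)
```

The application would call something like `from_cdata_text` on the element content before handing the bytes to the notation interpreter.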
- My idea is to use MIME encodings, but put these attachments _outside_
- the SGML text, in an attached (or external) body part.
-
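One way to picture the attached-body-part arrangement, sketched with Python's standard `email` package. The function name, the entity name "logo", the file name and the media types are all assumptions made for the example; nothing here is prescribed by the WWW project or by the MIME draft.

```python
from email.message import EmailMessage

def build_message(sgml_text: str, raw: bytes, filename: str) -> EmailMessage:
    """Put the SGML text in the first body part and the binary object in a
    separate attached part, which the library base64-encodes by default."""
    msg = EmailMessage()
    msg.set_content(sgml_text)
    msg.add_attachment(raw, maintype="application",
                       subtype="octet-stream", filename=filename)
    return msg

# The SGML text refers to the object by (hypothetical) entity name only.
payload = bytes(range(256))
msg = build_message('<p>See <fig entity="logo"></p>\n', payload, "logo.bin")
attachment = next(msg.iter_attachments())
```

The SGML text stays free of binary data, and the attachment round-trips unchanged through `attachment.get_payload(decode=True)`.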
- >In addition to being much easier to deal with in SGML, this also makes
- >SGML documents containing such content robust with respect to file
- >transfer, etc.
- >
- >Hope this helps,
- ></Erik>
-
- Thanks. Mostly it confirms my suspicions, but it should also provide
- a somewhat authoritative answer (no references to ISO 8879 here :-)
- to the WWW project.
-
- >--
- >Erik Naggum | +47-295-0313 | ISO 8879 SGML | Memento,
- >Naggum Software | "fuzzface" | ISO 10744 HyTime | terrigena.
- >Boks 1570, Vika | <erik@naggum.no> | JTC 1/SC 18/WG 8 | Memento,
- >0118 OSLO, NORWAY | <enag@ifi.uio.no> | SGML UG SIGhyper | vita brevis.
-